Brief Summary Text (4) : 

The vast amount of text and other types of information available in electronic form 
has contributed substantially to an "information glut." In response, researchers 
are creating a variety of methods to address the need to efficiently access 
electronically stored information. Current methods are typically based on finding 
and exploiting patterns in collections of text. Variations among the methods and 
the factions are primarily due to varying allegiances to linguistics, quantitative 
analysis, representations of domain expertise, and the practical demands of the 
applications. Typical applications involve finding items of interest from large 
collections of text, having appropriate items routed to the correct people, and 
condensing the contents of many documents into a summary form. 

Detailed Description Text (5) : 

For one embodiment, the contextual associations of a term provide contextual 
meaning of the term. For example, the term "fatigue" can refer to human physical 
tiredness such as "Fatigue impaired the person's judgment." Or "fatigue" can refer 
to breakdown of the structure of a material such as "Metal fatigue caused the 
aluminum coupling to break." A first aggregation of associations between term pairs 
such as: "fatigue" and "person", "fatigue" and "impaired", and "fatigue" and 
"judgment" can be clearly differentiated from a second aggregation of associations 
such as "metal" and "fatigue", "fatigue" and "aluminum", "fatigue" and "coupling", 
and "fatigue" and "break". Thus, when searching a database of subsets for subsets 
containing the notion of "fatigue" in the sense of human physical tiredness, 
subsets having greater similarity to the first aggregation of associations are more 
likely to include the appropriate sense of "fatigue", so these subsets would be 
retrieved. Further, the contextual associations found in the retrieved subsets can 
both refine and extend the contextual meaning of the term "fatigue". 
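
The distinction between the two aggregations can be illustrated with a minimal Python sketch. The sketch assumes, for illustration only, that an association is an unordered pair of terms co-occurring in one sentence, that each co-occurrence counts as 1, and that a small stopword list removes function words. The names STOPWORDS, associations, and overlap are hypothetical and are not taken from the described embodiment.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"the", "to", "a", "of", "was", "by"}

def associations(sentence):
    """Return a Counter of unordered term pairs co-occurring in a sentence."""
    # Drop possessive endings so "person's" is treated as "person".
    text = sentence.lower().replace("'s", "")
    terms = [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]
    # Sorting makes each pair a canonical (alphabetical) tuple.
    return Counter(combinations(sorted(set(terms)), 2))

def overlap(model_a, model_b):
    """Count the term pairs shared by two models (an assumed similarity)."""
    shared = model_a.keys() & model_b.keys()
    return sum(min(model_a[p], model_b[p]) for p in shared)

# First and second aggregations, built from the two example sentences.
human = associations("Fatigue impaired the person's judgment.")
metal = associations("Metal fatigue caused the aluminum coupling to break.")

# A subset of the database to be tested against both aggregations.
subset = associations("The pilot's judgment was impaired by fatigue.")

print(overlap(subset, human), overlap(subset, metal))
```

Under these assumptions the subset shares three term pairs with the first aggregation and none with the second, so it would be retrieved for the human-tiredness sense of "fatigue".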

Detailed Description Text (8) : 

Relevance ranking a collection of models is a method of quantifying the degree of 
similarity of a first model (i.e., a criterion model) and each one of the models in 
the collection, and assigning a rank ordering to the models in the collection 
according to their degree of similarity to the first model. The same rank ordering 
can also be assigned, for example, to the collection of identifiers of the models 
in the collection, or a collection of subsets of a database represented by the 
models of the collection. The features of the criterion model are compared to the 
features of each one of the collection of other models. As will be described in 
more detail below, the features can include the relations and the contextual 
measurements, i.e., the relational metric values of the relations in the models. The 
collection of other models is then ranked in order of similarity to the criterion 
model. As an example: the criterion model is a model of a query. The criterion 
model is then compared to a number of models of narratives. Then each one of the 
corresponding narratives is ranked according to the corresponding level of 
similarity of that narrative's corresponding model to the criterion model. As 
another alternative, the criterion model can represent any level of text and 
combination of text, or data from the database, or combination of segments of sets 
of databases. 
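
As a minimal sketch of relevance ranking, the following Python example compares a criterion model with each model in a collection and orders the identifiers by similarity. It assumes that a model is a mapping from a relation (a term pair) to a relational metric value and that similarity is measured by cosine similarity. The text above does not prescribe a particular measure, so cosine similarity and the names similarity, rank, query, and narratives are illustrative assumptions.

```python
import math

def similarity(criterion, model):
    """Cosine similarity over relation values (an assumed measure)."""
    dot = sum(v * model.get(rel, 0.0) for rel, v in criterion.items())
    norm_c = math.sqrt(sum(v * v for v in criterion.values()))
    norm_m = math.sqrt(sum(v * v for v in model.values()))
    if norm_c == 0.0 or norm_m == 0.0:
        return 0.0
    return dot / (norm_c * norm_m)

def rank(criterion, collection):
    """Return (identifier, score) pairs, most similar model first."""
    scored = [(ident, similarity(criterion, m)) for ident, m in collection.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Criterion model: a model of a query (hypothetical values).
query = {("fatigue", "judgment"): 1.0, ("fatigue", "impaired"): 1.0}

# Collection: models of narratives, keyed by narrative identifier.
narratives = {
    "n1": {("fatigue", "metal"): 2.0, ("coupling", "fatigue"): 1.0},
    "n2": {("fatigue", "judgment"): 1.0, ("fatigue", "person"): 1.0},
    "n3": {("fatigue", "judgment"): 2.0, ("fatigue", "impaired"): 2.0},
}

ranking = rank(query, narratives)
print([ident for ident, _ in ranking])
```

Because the ranking is returned as identifiers paired with scores, the same ordering can be applied to the narratives themselves or to the subsets of the database that the models represent.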

Detailed Description Text (274) : 

Phrase generation is one of several methods that display phrases contained in 
collections of text as a way to assist a user in domain analysis or query 
formulation and refinement. Phrase generation, described herein, includes an 
implicit phrase representation that can provide all possible phrases from the 
database. In contrast, other methods, such as Godby (1994); Gutwin, Paynter, Witten, 
Nevill-Manning, and Frank (1998); Normore, Bendig, and Godby (1999); Zamir and 
Etzioni (1999); and Jones and Staveley (1999), maintain explicit and incomplete 
lists of phrases. In addition, phrase generation can provide the essence of 
multiple, similar phrases, which can be used as queries in a phrase search. The 
option of using the flexible matching of phrase search allows the generated query 
phrases to match both identical and nearly identical phrases in the text. This 
ensures that inconsequential differences do not spoil the match. 

Detailed Description Text (312) : 

FIG. 22D illustrates one embodiment of emphasizing the locally relevant phrases and 
de-emphasizing the globally relevant phrases in block 2238 of FIG. 22B. First, the 
re-weighted model is selected in block 2260 and the processed phrases are selected 
in block 2262. Alternatively, a weight could also be determined for each one of the 
processed phrases. The weight for each one of the processed phrases could also be 
set to a pre-selected value such as 1. A frequency of occurrence of the phrase 
within the selected relevant text could also be determined and used as the phrase 
weight. The selected phrases are then compared to the re-weighted model in block 
2264. The selected phrases are then ranked in order of relevance to the re-weighted 
model in block 2266. The comparison in block 2264 can be a process similar to the 
comparison process in keyterm search described in FIG. 10 above. Thus, each phrase 
is modeled as a subset of the database, and the re-weighted model is used as a 
criterion model. The criterion model (that is, the re-weighted model) is compared 
with the subset models which represent the phrases to determine the degree of 
similarity of the criterion model and each of the phrase models. In addition, the 
ranking of the phrases in block 2266 can be done using the process of ranking 
subsets in keyterm search described above. Thus, the phrases are ranked on their 
degree of similarity to the re-weighted model. 
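
The comparison and ranking of blocks 2264 and 2266 can be sketched in Python under stated assumptions. The sketch assumes that a phrase is modeled as the set of relations between its term pairs, that the phrase weight (either the pre-selected value 1 or the frequency of the phrase in the selected relevant text) scales each relation value, and that the comparison is a sum of products over shared relations. How the weight enters the comparison is not specified above, so this scaling, along with the names phrase_model, score, and rank_phrases, is a hypothetical choice.

```python
from itertools import combinations

def phrase_model(phrase, weight=1.0):
    """Model a phrase as relations between its term pairs, each valued at `weight`."""
    terms = sorted(set(phrase.lower().split()))
    return {pair: weight for pair in combinations(terms, 2)}

def score(criterion, model):
    """Sum of products over shared relations (an assumed comparison)."""
    return sum(v * criterion.get(rel, 0.0) for rel, v in model.items())

def rank_phrases(reweighted, phrases, relevant_text=None):
    """Rank phrases by similarity to the re-weighted (criterion) model."""
    ranked = []
    for phrase in phrases:
        if relevant_text is None:
            weight = 1.0  # pre-selected weight
        else:
            # Frequency of occurrence of the phrase within the relevant text.
            weight = float(relevant_text.lower().count(phrase.lower()))
        ranked.append((phrase, score(reweighted, phrase_model(phrase, weight))))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Hypothetical re-weighted model and processed phrases.
reweighted = {("fatigue", "metal"): 3.0, ("crack", "fatigue"): 2.0, ("crew", "flight"): 0.5}
phrases = ["metal fatigue", "fatigue crack", "flight crew"]
text = ("Fatigue crack growth was found. The fatigue crack began near a rivet. "
        "Metal fatigue was suspected.")

by_unit_weight = [p for p, _ in rank_phrases(reweighted, phrases)]
by_frequency = [p for p, _ in rank_phrases(reweighted, phrases, text)]
print(by_unit_weight)
print(by_frequency)
```

With the pre-selected weight of 1, the order follows the re-weighted model alone. With frequency weights, "fatigue crack" moves ahead of "metal fatigue" because it occurs twice in the relevant text.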

Detailed Description Text (326) : 

The above described methods and processes of keyterm search, phrase search, phrase 
generation and phrase discovery have been described and illustrated in terms of 
information retrieval (IR) embodiments. In IR: terms are symbols or elements of a 
data set, subsets are collections of symbols, databases are collections of subsets, 
each relation is binary and links a symbol pair, and quantification of relations is 
based on contextual associations of symbols within subsets. Further, models are 
collections of symbol relations, the models can be aggregated, the models can 
represent subsets, databases, and queries, models can be ranked on similarity to 
other models, and sequentially grouped terms are derived from models and subsets. 
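
The IR vocabulary above maps onto simple data structures, as in the following Python sketch. It assumes that a subset is a list of symbols, that a database is a list of subsets, and that the value of a relation is the number of subsets in which its two symbols co-occur. The counting rule and the names subset_model and aggregate are illustrative assumptions, not the relational metrics of the described embodiments.

```python
from collections import Counter
from itertools import combinations

def subset_model(subset):
    """Model one subset: each unordered symbol pair is a binary relation."""
    return Counter(combinations(sorted(set(subset)), 2))

def aggregate(models):
    """Aggregate models by summing the values of matching relations."""
    total = Counter()
    for m in models:
        total.update(m)
    return total

# A database is a collection of subsets; a subset is a collection of symbols.
database = [
    ["metal", "fatigue", "break"],
    ["fatigue", "break"],
    ["pilot", "fatigue"],
]

# The database model is the aggregation of its subset models.
db_model = aggregate(subset_model(s) for s in database)
print(db_model[("break", "fatigue")])
```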

Detailed Description Text (328) : 

As with term pair relations in the IR embodiment, quantification of entity pair 
relations in the real world can also be based on contextual associations. In the 
real world, the scope of that context is space, time, causality, and thought. Thus, 
the notion of context is not limited to proximity relations among symbols within a 
subset. Instead, real-world context is a much broader concept, one that is more 
fully represented by the term "metonymy" in the sense developed by Roman Jakobson 
(Jakobson, R.: "Two aspects of language and two types of aphasic 
disturbances" (1956), pp. 95-114, and "Marginal notes on the prose of the poet 
Pasternak" (1935), pp. 301-317, in K. Pomorska and S. Rudy (Eds.), Language in 
Literature. Belknap Press, Cambridge, Mass., 1987). Jakobson asserted that the 
interpretation of a symbol or entity is derived from both its similarity to others 
and its contextual association with others. Thus, the contextual meaning of a 
symbol or entity is determined by its connections with others in the same context, 
that is, by its metonymic relations with others. This notion of metonymy, of 
contextual meaning, is a fundamental structural component of narrative text, symbol 
systems, and human behavior, according to Jakobson. 